Automatic Moderation of Online Discussion Sites

Authors

  • Jean-Yves Delort
  • Bavani Arunasalam
  • Cécile Paris
Abstract

Online discussion sites (ODS) are plagued with various types of unwanted content, such as spam and obscene or malicious content. Prevention-based and detection-based techniques have been proposed to filter inappropriate content (IC) out of ODS. However, while prevention techniques have been widely adopted, detection of IC remains mostly a manual task. Existing detection techniques, which are divided into rule-based and statistical techniques, suffer from various limitations. Rule-based techniques usually consist of manually crafted rules or blacklists of keywords. Both are time-consuming to create and tend to generate too many false positives and false negatives. Statistical techniques typically use corpora of labeled examples to train a classifier to tell “good” and “bad” messages apart. Although statistical techniques are generally more robust than rule-based techniques, they are difficult to deploy because of the prohibitive cost of manually labeling examples. In this paper, we describe a novel classification technique that trains a classifier from a partially labeled corpus and uses it to moderate IC in ODS. Partially labeled corpora are much easier to produce than completely labeled corpora, as they are made up only of unlabeled examples and examples labeled with a single class (e.g. “bad”). We implemented and tested this technique on a corpus of messages posted on a stock message board and compared it with two baseline techniques. Results show that our method outperforms the two baselines and that it can be used to significantly reduce the number of messages that need to be reviewed by human moderators.
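To make the idea of a partially labeled corpus concrete, the sketch below trains a small naive Bayes scorer from messages labeled "bad" plus unlabeled messages, then ranks the unlabeled messages for human review. It is not the authors' technique, which the abstract does not detail. It is a common simple approach to this setting that provisionally treats every unlabeled message as "good". All function names and the whitespace tokenizer are hypothetical, and the sketch assumes both message lists are non-empty.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a sketch.
    return text.lower().split()


def train(bad_msgs, unlabeled_msgs):
    """Collect per-class token counts, treating unlabeled messages as 'good'.

    This is a simplification: some unlabeled messages are really 'bad',
    so the 'good' counts are noisy.
    """
    counts = {"bad": Counter(), "good": Counter()}
    for msg in bad_msgs:
        counts["bad"].update(tokenize(msg))
    for msg in unlabeled_msgs:
        counts["good"].update(tokenize(msg))
    vocab = set(counts["bad"]) | set(counts["good"])
    n_docs = {"bad": len(bad_msgs), "good": len(unlabeled_msgs)}
    return counts, vocab, n_docs


def bad_log_odds(msg, model):
    """Return log P(bad | msg) - log P(good | msg) under naive Bayes."""
    counts, vocab, n_docs = model
    total = {c: sum(counts[c].values()) for c in counts}
    score = math.log(n_docs["bad"]) - math.log(n_docs["good"])
    for tok in tokenize(msg):
        if tok not in vocab:
            continue  # ignore tokens never seen in training
        # Add-one (Laplace) smoothing avoids zero probabilities.
        p_bad = (counts["bad"][tok] + 1) / (total["bad"] + len(vocab))
        p_good = (counts["good"][tok] + 1) / (total["good"] + len(vocab))
        score += math.log(p_bad / p_good)
    return score


def review_queue(unlabeled_msgs, model, top_k):
    """Return the top_k unlabeled messages most likely to be 'bad'."""
    ranked = sorted(unlabeled_msgs, key=lambda m: bad_log_odds(m, model), reverse=True)
    return ranked[:top_k]
```

A moderator would then review only the head of the queue returned by `review_queue` rather than every message, which is the kind of workload reduction the abstract refers to.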

Similar resources

A Simulation for Designing Online Community: Member Motivation, Contribution, and Discussion Moderation

This article describes and validates an agent-based model that integrates social psychological theories on collective effort, group identity, and interpersonal bonds to understand trade-offs in designing online communities. The model is then used to examine when different types of moderation in online communities will be valuable: no moderation in which all members are exposed to all messages, ...


Everything in Moderation: A case for the balanced moderation of user-generated content on news sites

Moderation of user-generated content on news Web sites is an increasingly pertinent topic for online news entities. The quality and quantity of user-generated content can either help or hinder the number of audience members a news outlet receives. Considerations such as the amount of resources that can be given to moderation, the types of moderation, the types of user-generated cont...


Agent-Based Modeling to Inform Online Community Theory and Design: Impact of Discussion Moderation on Member Commitment and Contribution

In this article, we advocate a new approach in theory development by translating and synthesizing insights from multiple social science theories in an agent-based model to understand challenges in building online communities. To demonstrate the utility of this approach, we use it to examine the effects of three types of discussion moderation in conversation-based communities: no moderation, in ...


Agent-Based Modeling to Inform Online Community Design: Impact of Topical Breadth, Message Volume, and Discussion Moderation on Member Commitment and Contribution

The design of complex social systems, such as online communities, requires the consideration of many parameters, a practice at odds with social science research that focuses on the effects of a small set of variables. In this paper, we examine three design decisions (topical breadth, message volume, and discussion moderation) and the trade-offs involved in making these decisions. We show how synth...


Moderated Online Communities and User-Generated Content

Online communities provide a social sphere for people to share information and knowledge. While information sharing is becoming a ubiquitous online phenomenon, how to ensure information quality or induce quality content, however, remains a challenge due to the anonymity of commentators. This paper introduces moderation into reputation systems. We show that moderation directly impacts strategic ...



Journal:
  • Int. J. Electronic Commerce

Volume: 15


Publication date: 2011